perm filename 0[0,BGB]3 blob
sn#028572 filedate 1973-03-14 generic text, type T, neo UTF8
00100 DRAFT THESIS OUTLINE. DECEMBER 1972
00200
00300 GEOMETRIC VISION
00400 - draft thesis outline -
00500
00600 B. G. Baumgart
00700
00800
00900 ABSTRACT:
01000
01100 This thesis is about a computer vision system based on a
01200 geometric model of the objects being viewed. In
01300 principle, this vision system is simply a process that can be applied
01400 to a reel of video tape to compute blueprints and geodetic maps.
01500 Applications of this system to object recognition, scene analysis and
01600 robot vehicle control are demonstrated.
01700
01800
01900 CONTENTS:
02000
02100 I. MEMORY.
02200
02300 A. Representation of a Geometric Mental Universe.
02400 B. Region-Edge Image Representation.
02500 C. Semantic, Feature and Predicate Representation.
02600
02700 II. PROCESS.
02800
02900 A. Image Prediction.
03000 B. Image Perception.
03100 C. Image Comparison.
03200 D. Camera Locus Solution.
03300 E. World Model Modification.
03400 1. delete object from map.
03500 2. add known object to map. (recognition).
03600 3. add or alter object in dictionary.
03700
03800 III. APPLICATION.
03900
04000 A. Blocks and Block Scenes.
04100 1. deletion of a block from a scene.
04200 2. addition of blocks to a scene.
04300 B. Tools and Table Top Scenes.
04400 1. complicated object perception.
04500 2. known object recognition.
04600 C. A Robot Vehicle and Outdoor Scenes.
04700 1. known road servoing.
04800 2. landscape perception.
00100 I. MEMORY STRUCTURE.
00200
00300 In order to get a computer to deal with the physical world it
00400 must have a data representation on which computations involving
00500 space, time, shape, size and the appearance of things can be done. In
00600 this section, a representation for the topology, geometry and
00700 photometry of everyday things is explained. The data
00800 structures discussed are implemented as small blocks of words
00900 containing pointers and data, in the fashion usual to graphics and
01000 simulation; an introduction to this technology can be found in Knuth
01100 [1]. Although the language of implementation is PDP-10 machine
01200 code, the data and functions presented below are accessible from
01300 higher level languages like LISP and ALGOL.
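
The "small blocks of words containing pointers and data" mentioned above can be illustrated with a minimal sketch in Python rather than PDP-10 machine code. Everything named here (BLOCK_SIZE, the word layout NEXT/KIND/DATA0/DATA1, allocate, release) is an assumption made for illustration and is not taken from the actual implementation.

```python
BLOCK_SIZE = 4             # words per node (hypothetical size)
NIL = 0                    # address 0 is reserved as the null pointer

memory = [0] * BLOCK_SIZE  # block 0 is never handed out
free_list = []             # addresses of released blocks


def allocate():
    """Return the address of a zeroed block, reusing freed blocks first."""
    if free_list:
        addr = free_list.pop()
        memory[addr:addr + BLOCK_SIZE] = [0] * BLOCK_SIZE
        return addr
    addr = len(memory)
    memory.extend([0] * BLOCK_SIZE)
    return addr


def release(addr):
    """Put a block back on the free list for later reuse."""
    free_list.append(addr)


# Word layout within a block (an assumption for this sketch).
NEXT, KIND, DATA0, DATA1 = 0, 1, 2, 3

a = allocate()
b = allocate()
memory[a + NEXT] = b       # a pointer is just the address of another block
memory[b + DATA0] = 42     # a data word
value_via_pointer = memory[memory[a + NEXT] + DATA0]
```

Following the NEXT word of block a reaches block b, so nodes of any kind can be chained into the rings and trees described in the next section.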
01400
01500 I.A. Representation of a Geometric Mental Universe.
01600
01700 At the top of the data structure is a single universe node
01800 from which everything else can be reached. Immediately below the
01900 universe node is a ring of world models. A robot dealing with
02000 physical world sensor input, such as video data, has one of its world
02100 models dedicated to simulating the immediate here and now; this
02200 mental world is called the reality world model. In addition to the
02300 reality world, a robot may have fantasy world models for problem
02400 solving, planning or for recalling platonic object prototypes. In the
02500 following, a two-world mental universe will be the most common, with
02600 the reality world being referred to as a "map" and the fantasy world
02700 being referred to as a "dictionary".
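
As a rough sketch of the arrangement just described, the following Python models a single universe node with a ring of world models below it. The class and method names (Universe, World, add_world, worlds) are hypothetical, and the ring is shown as a singly linked circular list; the actual node layout may differ.

```python
class World:
    """One world model; `next` links it into the universe's ring."""

    def __init__(self, name):
        self.name = name
        self.bodies = []      # body nodes belonging to this world
        self.next = None      # next world in the ring


class Universe:
    """Single root node from which every world model can be reached."""

    def __init__(self):
        self.first = None

    def add_world(self, world):
        if self.first is None:
            world.next = world             # a ring of one points at itself
            self.first = world
        else:
            # splice in just after the first world, keeping the ring closed
            world.next = self.first.next
            self.first.next = world
        return world

    def worlds(self):
        """Walk the ring once, starting and stopping at `first`."""
        w = self.first
        while w is not None:
            yield w
            w = w.next
            if w is self.first:
                break


universe = Universe()
reality = universe.add_world(World("map"))          # the here and now
fantasy = universe.add_world(World("dictionary"))   # object prototypes
ring_names = [w.name for w in universe.worlds()]
```

Because the worlds form a ring rather than a terminated list, a traversal can begin at any world and still visit all of them.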
02800
02900 Geometric world models have four basic kinds of nodes:
03000 body, face, edge and vertex. The face, edge and vertex nodes are used
03100 to form polyhedrons which may be attached to body nodes. Body nodes
03200 in turn are connected to each other in rings and trees to form a
03300 world model. Additional kinds of nodes describe cameras and light
03400 sources as well as temporary data such as shadows, spines, and
03500 trajectories.
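
The four node kinds can be sketched as follows. This is a deliberately simplified illustration in Python and not the winged edge structure of AIM-179: the class names, the fields, and the helper make_tetrahedron are all assumptions, and face vertex order is not consistently oriented here.

```python
from itertools import combinations


class Vertex:
    def __init__(self, x, y, z):
        self.xyz = (x, y, z)


class Edge:
    def __init__(self, v1, v2):
        self.vertices = (v1, v2)
        self.faces = []           # the faces meeting along this edge


class Face:
    def __init__(self, vertices):
        self.vertices = vertices
        self.edges = []


class Body:
    """A body owns one polyhedron and links to other bodies in a tree."""

    def __init__(self, name):
        self.name = name
        self.vertices, self.edges, self.faces = [], [], []
        self.parent = None
        self.parts = []

    def attach(self, child):
        child.parent = self
        self.parts.append(child)


def make_tetrahedron(name):
    body = Body(name)
    body.vertices = [Vertex(0, 0, 0), Vertex(1, 0, 0),
                     Vertex(0, 1, 0), Vertex(0, 0, 1)]
    edge_of = {}
    for a, b in combinations(range(4), 2):      # every pair of corners
        edge = Edge(body.vertices[a], body.vertices[b])
        edge_of[frozenset((a, b))] = edge
        body.edges.append(edge)
    for tri in combinations(range(4), 3):       # every triple is a face
        face = Face([body.vertices[i] for i in tri])
        for a, b in combinations(tri, 2):
            edge = edge_of[frozenset((a, b))]
            face.edges.append(edge)
            edge.faces.append(face)
        body.faces.append(face)
    return body


tetra = make_tetrahedron("tetrahedron")
# Euler's formula for a simple polyhedron: V - E + F = 2
euler = len(tetra.vertices) - len(tetra.edges) + len(tetra.faces)
```

Checking Euler's formula and that every edge borders exactly two faces is a cheap sanity test that the face, edge and vertex nodes really close up into a polyhedron.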
03600
03700 ...continuation of this section follows AIM-179,
03800 "Winged Edge Polyhedron Representation" - Baumgart.
00100 I.B. Region-Edge Image Representation.
00200
00300 The image data structure presented in this section is a
00400 computer's internal notation for what is vulgarly called a line
00500 drawing; the common term is misleading because it does not suggest
00600 the equally important space between the lines; terms closer to the
00700 idea would be "mosaic drawing" or "stained glass window drawing".
00800
00900 The data structure has four main levels: TV raster, video
01000 intensity contour, arc contour, and region-edge.
01100 ...continuation of this section follows SAILON-71,
01200 "CART'S EYE THREE and its IMAGE REPRESENTATION" - Baumgart.
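
To make the "mosaic drawing" idea concrete, here is a small Python sketch of the top, region-edge level only: given an image already segmented into labelled regions, it records each region and each edge separating a pair of regions. The function name region_edge and the toy label grid are assumptions for illustration; this is not the procedure of SAILON-71 and it skips the raster and contour levels entirely.

```python
def region_edge(labels):
    """Return (regions, edges) for a grid of region labels.

    regions maps label -> pixel count; edges maps an unordered pair of
    labels -> number of pixel sides shared along their common boundary.
    """
    regions, edges = {}, {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            here = labels[r][c]
            regions[here] = regions.get(here, 0) + 1
            # look right and down so each pixel side is counted once
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < rows and cc < cols and labels[rr][cc] != here:
                    pair = frozenset((here, labels[rr][cc]))
                    edges[pair] = edges.get(pair, 0) + 1
    return regions, edges


# A toy 3 x 3 image already segmented into three regions A, B and C.
image = [
    "AAB",
    "AAB",
    "CCB",
]
regions, edges = region_edge(image)
```

Each edge is keyed by the two regions it separates, so the space between the lines is represented on an equal footing with the lines themselves.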
01300
01400
00100 II. PROCESS.
00200
00300 A. Image Prediction.
00400 B. Image Perception.
00500 C. Image Comparison.
00600 D. Camera Locus Solution.
00700 E. World Model Modification.
00800 1. delete object from map.
00900 2. add known object to map. (recognition).
01000 3. add or alter object in dictionary.
01100
01200 III. APPLICATION.
01300
01400 A. Block Scenes.
01500 1. deletion of a block from a scene.
01600 2. addition of blocks to a scene.
01700 B. Tools and things.
01800 1. complicated object perception.
01900 2. known object recognition.
02000 C. Robot Vehicle.
02100 1. known road servoing.
02200 2. landscape perception.